Abstract


Potholes pose a threat to driver safety and can lead to traffic accidents, emphasizing the 
need for efficient and proactive pothole management in complex road environments. This 
not only secures driver safety but also contributes to accident prevention and the smooth 
flow of traffic. This paper addresses the pothole detection challenge with an
Enhanced Faster R-CNN algorithm that integrates Faster R-CNN with EfficientNet. The
focus is on improving detection performance, especially for autonomous cars in
third-world countries like Nigeria. Using Lightweight Faster R-CNN with EfficientNet, the
model achieved an accuracy of 97.7%,
surpassing other CNN architectures such as MobileNet V3, ResNet 50, VGG16, and Inception
V3. The results underscore the model's effectiveness for real-time decision-making in
pothole detection and indicate its potential for practical application in third-world countries,
offering a valuable solution for safer and more efficient navigation on potholed roads.


Keywords: Pothole detection, Accident Prevention, Autonomous Vehicle Systems (AVS), 
Lightweight Faster RCNN 


1. Introduction 


The recent advancements in artificial intelligence have paved the way for the 
development of Autonomous Vehicle Systems (AVS) in the automotive industry. However, 
ensuring the safety of human lives in real road conditions remains a challenge for these 
industries. The World Health Organization has reported high road death rates, which are 
mainly caused by casual driving and a lack of understanding of the driving scene. To 
address this, researchers have explored the use of deep learning and computer vision 
techniques for the understanding of the driving scene, which involves feature extraction, 
classification, detection, and tracking. 


Autonomous vehicles or self-driving cars are being heavily researched and developed by 
various technological firms and universities. The goal is to create safe and convenient 
navigation through the use of artificial intelligence and computer vision techniques. 
However, the challenges in object detection and recognition, especially in varying 
lighting and weather conditions, still need to be overcome. The use of deep learning and 
computer vision in understanding the driving scene is one of the ways to address these 
challenges. The detection of road conditions, such as potholes, is also an important
aspect for autonomous vehicle systems. Deep learning with computer vision has the 
potential to provide affordable and robust solutions for autonomous driving. One example
of this is the use of a deep learning-based solution for pothole detection in self-driving
cars, which outperforms traditional image processing methods.


The detection of road conditions, such as potholes, is also a crucial aspect of an AVS. 
Various approaches have been taken to detect potholes, including the use of thermal 
cameras and optical cameras. The existing methods, however, have limited performance 
and are either expensive or complex. A simpler and low-cost vision-based Deep Learning 
model has been proposed to improve the performance of pothole detection for AVS and 
demonstrate its driving intelligence in avoiding potholes to prevent accidents. However, 
the intelligence of the AVS in making decisions based on the detected potholes remains 
to be explored. Alhaji et al. (2022) and Oyekanmi and Ejem (2022) have highlighted the 
prevalence of extensive potholes and depressions on roads, particularly in certain regions 
of the country, exacerbating traffic challenges. 


The integration of deep learning and computer vision has the potential to provide 
affordable and robust solutions for the autonomous driving industry. The use of 
Convolutional Neural Networks (CNNs) has significantly improved the accuracy of image
classification compared to previous methods using hand-crafted feature extractors and
classifiers. By training high-capacity models with only a small amount of annotated
detection data, CNNs provide better object detection performance than systems based on
simpler features. In the case of pothole detection in self-driving cars, a deep learning-based
solution has been presented that reliably detects and recognizes potholes,
outperforming traditional image processing methods. Lightweight deep
learning techniques such as Lightweight Faster R-CNN run faster than other
techniques (Gayathri & Thangavelu, 2021).


The use of autonomous cars for pothole detection is still in its early stages. While the
technology for autonomous vehicles has advanced significantly, the infrastructure, road
conditions, and regulations still need improvement to make the use of
autonomous cars a reality. Implementing autonomous vehicles will require investment in
road upgrades, technology adaptation, and the development of clear regulations for 
autonomous driving. It may take some time for the technology to be widely adopted, but 
the benefits of increased safety, efficiency, and reduced human error in driving are likely 
to drive the implementation of autonomous cars in the country. 


2. Related Works 


In recent years, computer vision-based approaches for road pothole detection have involved
using vehicle-mounted cameras and applying image processing techniques such as edge
detection, blob analysis, and deep learning algorithms. These approaches offer flexibility
and robustness in varied road and lighting conditions. Ma et al. (2022) presented a
state-of-the-art review of computer vision systems and algorithms for road imaging and
pothole detection. Classical 2-D image processing, a traditional technique, is widely used
in pothole detection, but it is sensitive to image quality and has difficulty handling
occlusions and viewpoint changes. These methods are also limited in their ability to
handle variability in image conditions such as illumination changes, camera distortions,
and background clutter. In addition, they need manual feature extraction, generalize
poorly, and require additional techniques to improve performance.


Sensor-based Approaches for road pothole detection utilize smart sensors and actuators 
in vehicles and infrastructure, gathering data to enhance automated driving. These 
approaches rely on on-board sensors like LiDAR, radar, cameras, and external sensors to 
perceive surroundings. Sensors such as LiDAR and radar are employed to detect and
identify potholes, providing accurate results. However, these sensors are expensive and require
more advanced setups than computer vision-based methods.


[Figure 1 labels: camera and laser; reflected wave; pothole]

Figure 1: Camera and Laser sensor for pothole detection on an autonomous car (Source:
Vupparaboina et al., 2015)


Bosi et al. (2019) proposed the use of the Virtual Sensor Concept for Pothole Detection in 
Vehicle of things. The Virtual Sensor was used for pothole detection in autonomous 
vehicles to inform the car about road conditions, improving navigation and potentially 
providing road maintenance information. 


Raja et al. (2022) demonstrated the development of a smart pothole-avoidance strategy 
for autonomous vehicles. According to the study, the model showed remarkable results in 
navigating potholes, which suggests its potential effectiveness in improving the safety 
and reliability of autonomous vehicles. 


The Virtual Sensor Concept for Pothole Detection faces challenges, including the need for 
accurate sensor data, real-time processing, reliable communication, and limitations in 
certain environments. Issues such as poor sensor data quality, computational power 
constraints, disrupted communication, and difficulty in adverse weather conditions may 
impact accuracy, limiting its effectiveness and necessitating additional solutions for these 
challenges. 


Deep Convolutional Neural Networks (DCNNs) have become a popular approach for road
pothole detection due to recent advancements in machine/deep learning. There are three 
main types of DCNNs used for this task: image classification networks, object detection 
networks, and semantic segmentation networks. 


Image classification networks are trained to classify road images as positive (pothole) or 
negative (non-pothole). Object detection networks are trained to detect road potholes at 
the instance level. Semantic segmentation networks are trained to segment road images 
into pixel-level or semantic-level representations for road pothole detection. 


These networks are trained using back-propagation with a large amount of
human-annotated road data. Data-driven road pothole detection is a popular and effective
approach because it does not rely on explicit parameters for road image or point cloud
segmentation.


Before deep learning technology exploded, researchers typically used classical image
processing algorithms to generate hand-crafted visual features and trained a
classifier on them to classify road image patches. The most representative current
approaches are CNN-based models.


Ahmed (2021) developed a smart pothole detection method using deep learning based on dilated
convolution. The method has the potential to provide accurate and robust road pothole
detection, but it has limitations in terms of scalability, dependence on training data
quality, and computational cost.


Kharel and Ahmed (2021) studied the use of Inverse Perspective Mapping and CNNs for
real-time pothole detection and area estimation using image processing. The study
showed that while CNNs can detect potholes with adequate accuracy, they have 
drawbacks including the need for large amounts of labelled training data and 
computational requirements, as well as variability in pothole appearance which affects 
accuracy. To overcome these challenges, additional techniques such as data 
augmentation and transfer learning are required. 


Yik et al. (2021) proposed a real-time pothole detection approach based on deep learning,
using the YOLOv3 algorithm. The approach has the potential to provide real-time and
accurate road pothole detection, but it has limitations in terms of scalability and
dependence on training data quality and image quality.


Dutta and Chakraborty (2020) used a novel convolutional neural network-based model
for an all-terrain autonomous car. Although the model has the
potential to provide a solution for autonomous driving in all terrains, it has
limitations in terms of robustness, dependence on training data, generalizability, and
complexity.


Feng et al. (2022) applied a multimodal attention fusion network to the segmentation of
road potholes for autonomous vehicles. The network has the potential to provide accurate
and robust road pothole detection and allows for the representation of multiple features
from different modalities, leading to a more complete representation of the road surface,
but it also has limitations in terms of scalability, dependence on training data quality,
and computational cost.


Gupta and Dixit (2022) integrated FactorNet into Faster R-CNN for pothole
detection in autonomous cars. The hybrid model
has the potential to provide accurate and efficient road pothole detection, but
also has limitations in terms of complexity, increased computational cost, and
dependence on training data.


Kortmann et al. (2022) designed an end-to-end pothole detection system for
autonomous driving that uses low-cost pre-installed sensors for real-time detection of road
damage and damage severity, together with cloud- and crowd-based HD feature maps and a
lightweight deep learning technique. The end-to-end system has the potential to provide real-time
and accurate road pothole detection, but also has limitations in terms of scalability,
dependence on sensor quality, and reliance on crowd-sourced data.


Ma et al. (2022) employed classical 2-D image processing, 3-D point cloud
modelling, and convolutional neural networks (CNNs). The hybrid approach of combining
classical 2D image processing and 3D point cloud modeling with Convolutional Neural 
Networks (CNNs) offers advantages such as improved accuracy and robustness, but also 
presents challenges like increased complexity and computational cost, as well as higher 
data requirements. The use of a hybrid approach has the potential to improve road 
pothole detection but may also require more resources and expertise. 


Manalo et al. (2022) proposed a transfer learning-based system for pothole detection in
roads using deep convolutional neural networks. The system has strengths in terms of
improved accuracy, efficient training, and robustness. However, it also has limitations in
terms of dependence on pre-trained models, limited adaptation, and complexity.


The limitations of deep learning in autonomous car pothole detection encompass issues 
like data quality dependence, overfitting, computational complexity, privacy concerns, 
challenges with unstructured data, and limited model generalization. Chen et al. (2022) 
propose an effective method using EfficientNet B4 for thermal image analysis, offering
lower computational complexity and high accuracy. The EfficientNet architecture is
recognized for its efficiency, making it suitable for resource-constrained platforms.
Utilizing it as the backbone for Lightweight Faster R-CNN in pothole recognition
promises high accuracy and efficiency, which are crucial for achieving lightweight and
efficient neural networks in applications like autonomous cars (Tang et al., 2021;
Jebamikyous & Kashef, 2022). Current studies often lack model generalization and may
exhibit bias, neglecting road conditions in developing countries. To address this, the
study aims to create a dataset reflecting Nigerian road conditions and use it to train a
pothole recognition model with Lightweight Faster R-CNN and EfficientNet for robust
navigation.

According to Wang and Li (2022), the EfficientNet architecture is known for its high
accuracy and efficiency, making it a popular choice for deployment on
resource-constrained platforms such as autonomous cars. The lightweight design and lower
computational complexity of the EfficientNet architecture make it well suited for use in
systems with limited processing power, such as cars. By using this architecture as the
backbone network for a Lightweight Faster R-CNN network for pothole recognition, it is
possible to achieve high accuracy and efficiency in real-time pothole detection for
autonomous cars.


2. Materials and methodologies 


In this study, a Light-Weight Faster R-CNN, a type of object detection network, is
used for pothole recognition in autonomous cars. It is an evolution of the Faster R-CNN
network and is designed to be more computationally efficient while maintaining high
accuracy. The architecture of the proposed model is shown in Figure 2.


[Figure 2 block labels: Pothole Dataset; EfficientNet (depthwise convolution layers,
pointwise convolution layers); Region Proposal Network; Faster R-CNN Network (fully
connected layer, regressor)]

Figure 2: Proposed Light Weight Faster R-CNN model (Source: Ahmed, 2021).


The network architecture of Light-Weight Faster R-CNN consists of two main
components: a backbone network and two task-specific branches. The backbone network
is a convolutional neural network (CNN) that is responsible for extracting features from
the input image. In the Light-Weight Faster R-CNN,
the EfficientNet backbone network is used to generate a set of feature maps that are then
fed into the two task-specific branches.
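
To illustrate why a backbone built from depthwise and pointwise convolutions can be lighter than one built from standard convolutions, the following minimal sketch compares weight counts for a single layer. The layer sizes (3x3 kernel, 32 input channels, 64 output channels) and the function names are illustrative assumptions and are not taken from this study.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
std = standard_conv_params(3, 32, 64)
sep = depthwise_separable_params(3, 32, 64)
print(std, sep, round(std / sep, 2))
```

For these assumed sizes the separable layer needs roughly one eighth of the weights, which is the kind of saving that makes such backbones attractive on resource-constrained platforms.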


The first branch is the Region Proposal Network (RPN), which is responsible for 
generating proposals or candidate regions that contain objects of interest. The RPN uses 
anchor boxes to generate these proposals and assesses their quality using a set of binary 
classification scores. The proposals with the highest scores are then passed to the second 
branch for further processing. 
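
The anchor-box idea described above can be sketched as follows. This is a minimal illustration, assuming three anchor sizes and three aspect ratios at a single feature-map location; the sizes, ratios, and the helper name `make_anchors` are hypothetical, since the values used in this study are not listed here.

```python
import math


def make_anchors(cx, cy, sizes, ratios):
    """Return (x1, y1, x2, y2) anchor boxes centred at (cx, cy).

    Each anchor has area size**2 and aspect ratio height / width = ratio.
    """
    anchors = []
    for size in sizes:
        for ratio in ratios:
            w = size / math.sqrt(ratio)
            h = size * math.sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


# Assumed sizes and ratios; one location at the centre of a 224x224 image.
anchors = make_anchors(cx=112, cy=112, sizes=(32, 64, 128), ratios=(0.5, 1.0, 2.0))
print(len(anchors))  # 3 sizes x 3 ratios = 9 anchors
```

The RPN scores each such anchor as object or background and regresses offsets that turn the best-scoring anchors into proposals.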


The second branch is the detection branch, which is responsible for classifying the 
objects within the proposals generated by the RPN. The detection branch uses a set of 
fully connected layers to classify the objects and to generate the final bounding boxes 
that define their locations in the image. 


The Light-Weight Faster R-CNN network is trained on a large dataset of annotated
images, which includes both positive samples (images that contain potholes) and negative 
samples (images that do not contain potholes). The network is trained using a 
combination of supervised learning and reinforcement learning, where the goal is to 
minimize the loss function and improve the accuracy of the model. 


Once the network is trained, it can be used to perform pothole recognition in real-world 
scenarios. During inference, the network takes an input image and generates a set of 
proposals and bounding boxes that define the locations of the potholes in the image. The 
output of the network can then be used by an autonomous vehicle to safely navigate 
around potholes and avoid potential damage to the vehicle. 
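
Detectors of this family commonly post-process their raw outputs with a confidence threshold followed by non-maximum suppression (NMS) so that each pothole is reported once. The sketch below shows that step in plain Python. The thresholds, the example boxes, and the function names are assumptions for illustration; the text does not state which post-processing settings were used.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, score_thr=0.5, iou_thr=0.5):
    """Drop low-score boxes, then apply greedy non-maximum suppression."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thr),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap strongly with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes on one pothole, one separate
# pothole, and one low-confidence box.
boxes = [(10, 10, 60, 60), (12, 12, 62, 62), (100, 100, 150, 150), (0, 0, 5, 5)]
scores = [0.90, 0.80, 0.75, 0.30]
print(filter_detections(boxes, scores))  # prints [0, 2]
```

The kept indices identify the boxes that a downstream planner would treat as distinct potholes.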


In summary, the Light-Weight Faster R-CNN network is a powerful and efficient tool for
pothole recognition in autonomous cars, as it combines the strengths of both CNNs and
object detection algorithms to provide accurate and computationally efficient results.
Using EfficientNet instead of a modified VGG16 as the backbone of Faster R-CNN
offers superior performance because it
balances model complexity and accuracy, resulting in better efficiency and reduced
computational demands, which are crucial for achieving lightweight and efficient neural
networks in applications like autonomous cars (Tang et al., 2021; Jebamikyous &
Kashef, 2022).


The pothole images will be collected using a vehicle fitted with cameras to photograph
various potholes across Kaduna metropolis. The images will be supplemented with 665 pothole
images from the COCO dataset. A total of 800 images will be collected and used for training.


3. Result


Faster R-CNN models with VGG16, Inception V3, MobileNetV2, and ResNet50 backbones, together
with the Improved Faster R-CNN with EfficientNet, were trained in this study. The goal was to
compare their outcomes with those of the proposed model on the basis of their parameters and
performance metrics. Table 1 shows the parameters of the Faster R-CNN models.


Table 1: The Faster R-CNN models parameters

Parameter | Value
Input Image Size | 224x224 pixels
Number of Convolutional Layers | 16
Activation Function | ReLU
Hidden Layer Units | 256
Classes | 2
Learning Rate | 0.001
Batch Size | 16
Number of Epochs | 50
Classification Loss | Categorical Cross-Entropy
Other Loss Functions | Binary Cross-Entropy, Smooth L1 Loss


Table 1 outlines the key parameters for configuring a neural network model, including 
details such as the input image size (224x224 pixels), the number of convolutional layers 
(16), activation function (ReLU), anchor box sizes and ratios for the Region Proposal 
Network (RPN), RPN stride, pooling size, hidden layer units (256), the number of output 
classes (2), learning rate (0.001), batch size (16), and the number of epochs (50), along 
with the associated loss functions (Binary Cross-Entropy, Categorical Cross-Entropy, and 
Smooth L1 Loss). 
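
As a worked illustration of two of the loss functions named above, the sketch below implements Smooth L1 loss for box coordinates and binary cross-entropy for a single object/background score in plain Python. The `beta` threshold of 1.0 and the example inputs are assumptions for illustration, not settings reported in this study.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss averaged over box coordinates.

    Quadratic for small errors (|d| < beta), linear for large ones.
    """
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def binary_cross_entropy(prob, label, eps=1e-12):
    """BCE for one anchor: label is 1 (object) or 0 (background)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


# Hypothetical box offsets and objectness probability.
box_loss = smooth_l1([0.2, 0.5, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0])
obj_loss = binary_cross_entropy(0.9, 1)
print(box_loss, obj_loss)
```

The linear branch keeps a single badly localized box (the 2.0 offset here) from dominating the gradient, which is the usual motivation for Smooth L1 over a plain squared error.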


The performance of the Improved Faster R-CNN with EfficientNet model is shown in 
Table 2. The table presents the results and evaluation metrics of the Improved Faster R- 
CNN with EfficientNet model. 


Table 2: Performance results of Faster R-CNN with EfficientNet model

Description | Improved Faster R-CNN with EfficientNet
Accuracy | 97.7%
Precision | 90.1%
Recall | 89.56%
F1 Score | 89.83%

Table 2 presented performance metrics for the Improved Faster R-CNN with 
EfficientNet. The model achieved a notable Accuracy of 97.7%, showcasing its 
correctness in identifying both potholes and non-potholes, ensuring highly reliable 
detection. The Precision of 90.1% underscored the accuracy in identifying potholes, 
minimizing false positives and avoiding unnecessary interventions for non-pothole areas. 
The Recall of 89.56% highlighted the model's capability to capture a significant portion of 
actual potholes, reducing missed detections (false negatives). The high recall ensured
comprehensive pothole detection. The F1 Score of 89.83% provided a balanced 
assessment, indicating effective harmony between accurate pothole identification and 
minimized false classifications. 
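
For reference, the four metrics discussed above can be computed from confusion-matrix counts as in the minimal sketch below. The function names are illustrative, and the only figures used are the precision and recall reported above; the final lines check that they reproduce the reported F1 Score.

```python
def precision(tp, fp):
    """Share of predicted potholes that are real potholes."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of real potholes that were detected."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Consistency check with the reported precision (90.1%) and recall (89.56%).
f1 = f1_score(0.901, 0.8956)
print(f"{f1:.4f}")  # prints 0.8983
```

The result matches the reported F1 Score of 89.83%.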


The Improved Faster R-CNN with EfficientNet exhibited outstanding performance in 
pothole detection, making it suitable for real-time applications on roads with potholes, 
particularly in third-world countries. The model's high accuracy, precision, recall, and F1 
Score collectively contribute to its effectiveness in identifying and addressing potholes. 


Table 3 summarizes the classification performance metrics obtained after simulation of
the five models.

Table 3: Model Comparison

Description | VGG16 | Inception V3 | MobileNetV2 | ResNet50 | Improved Faster R-CNN with EfficientNet
Accuracy (%) | 95.22 | 94.12 | 95.62 | 94.71 | 97.7
Precision (%) | 89.41 | - | 90.32 | 90.62 | 90.1
Recall (%) | 87.50 | 83.70 | 89.60 | 87.21 | 89.56
F1 Score (%) | 88.44 | 84.59 | 89.96 | 88.88 | 89.83


Table 3 presents the performance of various models in pothole detection. The VGG16 
model demonstrates high accuracy, precision, and recall, indicating reliable detection, 
albeit with slightly lower performance compared to EfficientNet. Inception V3 exhibits 
good accuracy but lower precision and recall, resulting in moderate overall performance. 
MobileNetV2 achieves a balanced performance across accuracy, precision, and recall, 
slightly lower than EfficientNet. ResNet50 also maintains a balanced performance but
with slightly lower accuracy and precision compared to EfficientNet. 


The Improved Faster R-CNN with EfficientNet achieves the highest accuracy (97.7%) of all
the models, with precision (90.1%), recall (89.56%), and F1 Score (89.83%) on par with the
best of the other models. On this basis, it emerges as the best model for real-time pothole
detection. Despite potential overfitting concerns, this model showcases exceptional
performance compared to other architectures. The choice of the best model depends on
specific application requirements and the tolerance for false positives and false
negatives.


4. Conclusion 

This paper focused on advancing pothole detection systems by implementing the 
improved Faster R-CNN with EfficientNet algorithm. Beginning with a thorough review 
of pothole detection systems, the study aimed to address the challenges in this domain. 
The proposed algorithm, leveraging Faster R-CNN and EfficientNet, demonstrated
significant advancements in accuracy, computational complexity reduction, and overall 
efficiency for real-world applications. The research methodology covered systematic data 
collection, model development, and rigorous evaluation. The developed Potholes 
Recognition model, integrating EfficientNet with Faster R-CNN, underwent 
comprehensive evaluation against benchmarks, showcasing impressive accuracy and
precision metrics. 

The findings underscore the model's efficacy with Accuracy of 97.7%, Precision of 90.1%,
Recall of 89.56%, and F1 Score of 89.83%. This success contributes substantially to
pothole detection technology, offering a practical and efficient solution for real-world 
applications and paving the way for further advancements. 


Future research opportunities include advancing neural network architectures,
incorporating multimodal data, real-world deployment, dynamic pothole classification,
transfer learning, autonomous vehicle integration, user feedback utilization,
energy-efficient implementations, privacy considerations, and international collaboration. These
directions aim to enhance pothole detection systems globally, contributing to safer and
well-maintained road infrastructures.